Comparison of Biological Significance of Biclusters of SIMBIC and SIMBIC+ Biclustering Models
Authors
Abstract
A query-driven biclustering model extracts biclusters based on a query gene or query condition. Each extracted bicluster consists of a set of genes and a subset of conditions that are similar to the query gene or query condition, and it also includes the query itself. Biclustering approaches are either top-down or bottom-up, depending on how they tackle the problem. Top-down techniques [3, 4] start with the entire gene expression matrix and iteratively partition it into smaller sub-matrices. Bottom-up approaches, in contrast, start with a randomly chosen set of biclusters that are iteratively modified, usually enlarged, until no local improvement is possible. In this paper, the biological significance of biclusters extracted using two query-driven models, SIMBIC and SIMBIC+, is compared. The paper is organized as follows. Section 2 analyzes the popular MSB algorithm. Section 3 introduces SIMBIC, an improved version of MSB, and Section 4 presents SIMBIC+, an enhanced version of SIMBIC. Section 5 illustrates the experimental analysis and the biological significance.
Index Terms: Data Mining, Gene Expression Data, Biclustering, Average Correlation Value, Biological Significance, Gene Ontology
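The index terms mention the Average Correlation Value (ACV) without defining it. As a rough illustration only, the sketch below computes ACV under the commonly used definition: the larger of the mean absolute pairwise Pearson correlation among the rows (genes) and among the columns (conditions) of a bicluster. The abstract does not say how SIMBIC or SIMBIC+ compute or use ACV, so the function names `pearson`, `mean_abs_corr`, and `acv`, and the handling of constant vectors, are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption for this sketch: a constant vector counts as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def mean_abs_corr(vectors):
    """Mean absolute correlation over all ordered pairs of distinct vectors."""
    k = len(vectors)
    if k < 2:
        return 0.0
    total = sum(
        abs(pearson(vectors[i], vectors[j]))
        for i in range(k)
        for j in range(k)
        if i != j
    )
    return total / (k * k - k)


def acv(bicluster):
    """ACV of a bicluster given as a list of rows (genes x conditions)."""
    rows = [list(r) for r in bicluster]
    cols = [list(c) for c in zip(*rows)]
    return max(mean_abs_corr(rows), mean_abs_corr(cols))


if __name__ == "__main__":
    # Rows differ only by an additive shift, so they are perfectly correlated.
    shifting = [[1, 2, 4], [3, 4, 6], [6, 7, 9]]
    print(acv(shifting))
```

Under this definition ACV lies between 0 and 1, and values close to 1 indicate that the genes or the conditions of the bicluster follow a strongly coherent pattern.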
Similar resources
Analysis and visualization of gene expression data using biclustering: A comparative study
In the last few years the gene expression microarray technology has become a central tool in the field of functional genomics in which the expression levels of thousands of genes in a biological sample are determined in a single experiment. Several clustering and biclustering methods have been introduced to analyze the gene expression data by identifying the similar patterns and grouping genes ...
Evaluating the statistical significance of biclusters
Biclustering (also known as submatrix localization) is a problem of high practical relevance in exploratory analysis of high-dimensional data. We develop a framework for performing statistical inference on biclusters found by score-based algorithms. Since the bicluster was selected in a data dependent manner by a biclustering or localization algorithm, this is a form of selective inference. Our...
A Novel Coherence Measure for Discovering Scaling Biclusters from Gene Expression Data
Biclustering methods are used to identify a subset of genes that are co-regulated in a subset of experimental conditions in microarray gene expression data. Many biclustering algorithms rely on optimizing mean squared residue to discover biclusters from a gene expression dataset. Recently it has been proved that mean squared residue is only good in capturing constant and shifting biclusters. Ho...
Efficient query-driven biclustering of gene expression data using Probabilistic Relational Models
Biclustering is an increasingly popular technique to identify gene regulatory modules that are linked to biological processes. We describe a novel method, called ProBic, that was developed within the framework of Probabilistic Relational Models (PRMs). ProBic is an efficient biclustering algorithm that simultaneously identifies a set of potentially overlapping biclusters in a gene expression da...
Efficient Mining Frequent Closed Discriminative Biclusters by Sample-Growth: The FDCluster Approach
DNA microarray technology has generated a large number of gene expression data. Biclustering is a methodology allowing for condition set and gene set points clustering simultaneously. It finds clusters of genes possessing similar characteristics together with biological conditions creating these similarities. Almost all the current biclustering algorithms find bicluster in one microarray datase...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013